We develop a graphical interpretation of ternary probabilistic forecasts in which
forecasts and observations are regarded as points inside a triangle. Within the 
triangle, we define a continuous colour palette in which hue and colour saturation are 
defined with reference to the observed climatology. In contrast to current methods, 
forecast maps created with this colour scheme convey all of the information present 
in each ternary forecast. 

The geometrical interpretation is then extended to verification under quadratic
scoring rules (of which the Brier Score and the Ranked Probability Score are
well-known examples). Each scoring rule defines an associated triangle in which the
square roots of the score, the reliability, the uncertainty and the resolution all have
natural interpretations as root-mean-square distances. This leads to our proposal
for a Ternary Reliability Diagram in which data relating to verification and
calibration can be summarised.

We illustrate these ideas with data relating to seasonal forecasting of
precipitation in South America, including an example of nonlinear forecast calibration.

Codes implementing these ideas have been produced using the statistical software 
package R and are available from the authors. 
1. Introduction
Forecasts are often given in probabilistic terms. For example: 'There is a 60%
chance that rainfall before 12h00 tomorrow will exceed 15mm'; 'The odds for the
football match are 6-to-1 against a home win and even money for a draw'; 'The
climate model predicts that mean daytime temperature in summer 2100 will be
normally distributed with mean $\mu$ and variance $\sigma^2$'. In each of these examples the
forecast consists of a probability distribution. The sample spaces for the observable
events of interest consist of two elements (exceed/not exceed), three elements (home
win/draw/away win) and the real line $\mathbb{R}$. Consequently, these examples relate to
binary, ternary and continuous probabilistic forecasts respectively.

† Author for correspondence (t.c.jupp@ex.ac.uk)
This paper is concerned with two types of visualisation. In physical space,
visualisation constitutes colour maps of probabilistic forecasts over a geographical
region. In probability space, visualisation constitutes a geometrical representation
of forecasts and observations in our proposed Ternary Reliability Diagram.

For binary forecasts, visualisation tools such as the reliability diagram, the
sharpness diagram and the relative operating characteristic (ROC) curve are well
known (Jolliffe & Stephenson 2003). Our aim in this paper is to develop analogues
of these ideas for ternary forecasts and to develop a geometrical intuition for
verification and recalibration.

The structure of this paper is as follows. §2 contains a discussion of probabilistic
forecasting, and illustrates how a continuous forecast distribution can be identified
with a ternary forecast. §3 then introduces the idea of representing each ternary
forecast as a point in barycentric coordinates, and hence assigning to it a unique
colour. In §4 the quality of a forecasting system is quantified using quadratic scoring
rules. This is then used in §5 to propose an algorithm for recalibrating probabilistic
forecasts, which is illustrated with seasonal forecasting data. Conclusions are
presented in §6.



2. Probabilistic forecasting 

The generic situation to be considered is as follows: a forecasting system (perhaps 
a suite of climate models with different initial conditions) can be used to produce 
probabilistic forecasts (that is, probability distributions) for some variable of inter- 
est at a number of points within a geographical area. The subsequent challenge is 
to display in map form as much as possible of the information contained in this spa- 
tial array of probability distributions. The most natural thing to do, perhaps, is to 
associate with each forecast distribution a set of scalars (such as the distribution's 
moments or a selection of its quantiles) and produce a separate map for each scalar. 
The disadvantage of this procedure is that multiple maps are needed to convey the 
information contained in the forecast distributions. If information on the skill of 
the forecasting system were available, yet another map would be required. 

Consider probabilistic forecasting of a scalar climate variable x such as
temperature or precipitation. The forecasting system produces a cumulative distribution
function (CDF) F(x) at each spatial location (Figure 1). The CDF expresses the
forecast probability that the variable of interest will not exceed x. The forecast
distribution F(x) can be interpreted as the forecasting system's state of knowledge
about the future value of the variable x. In contrast to a deterministic forecast $x_f$
(which is a scalar), a probabilistic forecast F(x) is (in principle at least) an
analytic function and hence an infinite-dimensional object. For this reason, it is not
immediately obvious how to visualise spatial probabilistic forecasts.

Often, the variable x has associated with it an observed climatology represented
by the CDF G(x) (Figure 1). The climatology can be interpreted as the historical
distribution of the observable quantity. Typically a climatology is calculated by
aggregating data from an observational record over at least the past 30 years.

It is important to note that the climatology G(x) and the forecast F(x) are both
continuous probability distributions and so it is quite possible for the forecasting
system to issue the climatology G(x) as its forecast. This is a perfectly valid
probabilistic forecast but not, perhaps, a very useful one. It is usually hoped that the
forecasting system's state of knowledge about the future extends beyond knowledge
of the climatology alone. For this reason, it will be helpful to view the climatology
as a benchmark distribution with which all other forecasts should be compared.
Thus, in the discussion below, a colour will be assigned to the forecast F(x) by
considering the 'distance' between the forecast F(x) and the climatology G(x).

[Figure 1 panels: 'Forecast F(x) and climatology G(x)'; 'Ternary climatology q'.]

Figure 1. Continuous distribution functions F(x) (forecast) and G(x) (climatology) can
be associated with ternary forecasts p and q by defining three categories B, N and A -
'below-', 'near-' and 'above-' normal (Equations 2.1 and 2.2).



(a) Ternary probabilistic forecasts 

In order to reduce the dimensionality of the problem it is useful to project the
continuous distributions G(x) and F(x) onto ternary distributions $q' = (q_B, q_N, q_A)$
and $p' = (p_B, p_N, p_A)$ (Figure 1). We adopt the convention that vectors are column
vectors and that ' denotes a transpose. The real line is divided into three ordered
categories $B = (-\infty, x_B]$, $N = [x_B, x_A]$ and $A = [x_A, \infty)$ whose labels have the
following rationale: B - 'below normal', N - 'near normal' and A - 'above normal'.

The ternary distributions q(G) and p(F) encode the probabilities with which
the variable is forecast to lie in the categories B, N and A (Figure 1), and so all of
their elements are non-negative with $q_B + q_N + q_A = p_B + p_N + p_A = 1$.

For given quantiles $x_B$ and $x_A$ it follows that the ternary climatology q and
ternary forecast p are given by:

$$q' = (q_B, q_N, q_A) = \big(G(x_B),\; G(x_A) - G(x_B),\; 1 - G(x_A)\big)$$
$$p' = (p_B, p_N, p_A) = \big(F(x_B),\; F(x_A) - F(x_B),\; 1 - F(x_A)\big) \qquad (2.1)$$

An equivalent interpretation is that for a specified ternary climatology q the
categories B, N and A are defined by the quantiles:

$$x_B = G^{-1}(q_B); \qquad x_A = G^{-1}(q_B + q_N) \qquad (2.2)$$

A natural choice for the ternary climatology is the uniform distribution
$q' = (\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})$. In this case - which we shall regard as the default - the categories B,
N and A are defined by the terciles of the climatology G(x) and observations over
a long period would be expected to lie with equal frequency in each of the three
categories. It should be stressed, however, that any choice of quantiles can be made
when defining the ternary climatology. For example, the choice $q' = (\tfrac{1}{4}, \tfrac{1}{2}, \tfrac{1}{4})$ would
imply that categories B and A had been chosen to be the lower and upper quartiles
of the climatology.
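
The projection of equations 2.1 and 2.2 can be sketched in a few lines. The following Python fragment is an illustration only, not the R code referred to in the abstract; the function names and the Gaussian example values (a climatology with mean 100 and standard deviation 30, a forecast with mean 80 and standard deviation 20) are assumptions made purely for the example.

```python
from statistics import NormalDist

def category_bounds(inv_cdf, q):
    """Equation 2.2: quantiles x_B and x_A implied by a ternary climatology q."""
    q_b, q_n, _ = q
    return inv_cdf(q_b), inv_cdf(q_b + q_n)

def ternary_from_cdf(cdf, x_b, x_a):
    """Equation 2.1: project a CDF onto the categories B, N and A."""
    return (cdf(x_b), cdf(x_a) - cdf(x_b), 1.0 - cdf(x_a))

# Hypothetical example values (not taken from the paper).
climatology = NormalDist(mu=100.0, sigma=30.0)   # G(x)
forecast = NormalDist(mu=80.0, sigma=20.0)       # F(x), drier and sharper

q = (1 / 3, 1 / 3, 1 / 3)                        # default tercile climatology
x_b, x_a = category_bounds(climatology.inv_cdf, q)
p = ternary_from_cdf(forecast.cdf, x_b, x_a)     # ternary forecast
q_check = ternary_from_cdf(climatology.cdf, x_b, x_a)  # recovers q
```

Applying the same projection to the climatology itself recovers q, which is a convenient check that the two equations are mutually consistent.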



3. Barycentric coordinates 

In this section, a geometrical representation of a ternary forecast p, ternary clima- 
tology q and ternary observation o is considered. 



(a) Scoring functions 

Scoring functions quantify the past skill of a forecasting system. The idea is to 
compare a ternary forecast p with the corresponding ternary observation o made 
after the event. Since the categories are exclusive and complete, it follows that a
ternary observation o can take one of three values $o_B' = (1,0,0)$, $o_N' = (0,1,0)$ and
$o_A' = (0,0,1)$ according to whether the observable is found to lie in category B, N
or A.

The skill of a forecasting system can be quantified for a set of forecast-observation 
pairs by a score function S. The score is a measure of the difference between fore- 
casts and observations. It follows that the lower the score, the more skilful the 
forecast. 

In this paper, attention is restricted to scores defined by quadratic forms

$$S = \overline{(p - o)' L'L (p - o)} \qquad (3.1)$$

where the 3-by-3 matrix L defines the particular scoring rule being used (Appendix),
L'L is assumed to be positive definite and the overbar denotes an average
over all forecast-observation pairs. The Brier Score (Appendix) and
the Ranked Probability Score (Appendix, section B) are well-known examples of
quadratic scores (Brier 1950; Epstein 1969; Murphy 1969; Murphy & Stael von Holstein
1975; Stael von Holstein & Murphy 1978).



The scoring matrix L can be used to define a 2-by-3 matrix M (Appendix,
equation A4) which maps a ternary forecast $p \in \mathbb{R}^3$ to a corresponding point
$P = Mp \in \mathbb{R}^2$ in the plane. This transformation maps ternary forecasts to points
within a triangle and so each scoring matrix L induces a corresponding triangle
in $\mathbb{R}^2$. In this paper, the matrix M is defined so that a score S corresponds to a
(mean) squared distance between forecasts P and observations O (considered as
points inside an appropriate triangle):

$$S = \overline{\lVert P - O \rVert^2} \qquad (3.2)$$

[Figure 2 panels (c) and (d): contour interval 0.05.]

Figure 2. Visualisation of ternary forecasts and observations in barycentric coordinates
using the categories B, N and A defined in Figure 1. (a) Observations O can take one of
three values and lie at the corners. The angle $\theta$ can be used to compare an arbitrary forecast
P with the climatology Q. (b) A simple way to assign colours to ternary forecasts is to
consider the most likely category. (c) For the Brier score, the root-score $\sqrt{S}$ corresponds to
distance in an equilateral triangle. Contours of $\sqrt{S}$ are shown for $o = o_B$ (solid), $o = o_N$
(dashed), $o = o_A$ (dotted). (d) as (c) but for the right-angled triangle induced by the
Ranked Probability Score.

The corners of this triangle are the points $O_B = M o_B$, $O_N = M o_N$ and
$O_A = M o_A$ associated with the three possible values of a ternary observation.
The point Q = Mq associated with the climatology lies within the triangle. The
default choice for the scoring matrix is that associated with the Brier Score, in
which case the associated triangle is an equilateral triangle with unit sides. In the
case of the default climatology $q' = (\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})$, the point Q lies at the centre of this
triangle (Figure 2a). This sort of visualisation receives different names in different
disciplines. In mathematics, it is known as a plot in barycentric coordinates while
in the applied sciences it is known as a ternary phase diagram.
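
The mapping from a ternary forecast to a point in the unit equilateral triangle can be sketched as follows. The matrix M itself is defined in the Appendix, which is not reproduced here, so the corner positions below (B at bottom-left, A at bottom-right, N at the apex) are an assumption chosen only for illustration. For any equilateral triangle with unit sides, the squared distance between two mapped points equals half the sum of squared differences between the corresponding probability vectors, and the sketch checks this numerically.

```python
import math

# Assumed corner positions of the unit equilateral triangle (illustrative only).
CORNERS = ((0.0, 0.0),                 # O_B
           (0.5, math.sqrt(3) / 2),    # O_N
           (1.0, 0.0))                 # O_A

def to_point(p):
    """Map p = (p_B, p_N, p_A) to the point P = Mp as a weighted mean of the corners."""
    x = sum(w * c[0] for w, c in zip(p, CORNERS))
    y = sum(w * c[1] for w, c in zip(p, CORNERS))
    return (x, y)

def squared_distance(p, o):
    """Squared distance in the triangle between the images of p and o."""
    (px, py), (ox, oy) = to_point(p), to_point(o)
    return (px - ox) ** 2 + (py - oy) ** 2

def half_sum_of_squares(p, o):
    """Half the sum of squared probability errors, computed without the triangle."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(p, o))

p = (0.5, 0.3, 0.2)          # hypothetical forecast
o_b = (1.0, 0.0, 0.0)        # observation in category B
d2 = squared_distance(p, o_b)
centre = to_point((1 / 3, 1 / 3, 1 / 3))
```

With these assumed corners the default climatology maps to the centroid of the triangle, in agreement with the text.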

Barycentric coordinates yield an intuitive geometrical interpretation of ternary 
forecasts. As well as allowing scores to be visualised as squared distances they allow 
any scheme for colour assignment to be visualised as a triangular colour palette. 
For example, Figure 2b illustrates the algorithm which assigns colours according to
which of the three possible outcomes is considered in the forecast to be the most 
probable. Here and subsequently it is assumed for illustration that the forecast 



Article submitted to Royal Society 



T.E. Jupp et al. 




variable x is precipitation. It therefore seems sensible to assign the 'dry' colour 
red to category B and the 'wet' colour blue to category A. Note that the ternary 
forecasts p' = (1, 0, 0) and p' = (0.34, 0.33, 0.33) would both be assigned the colour
red in Figure 2b even though they differ greatly. In the former case the forecasting
system is certain that x will lie in category B while in the latter case the forecast
is barely different from the climatology $q' = (\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})$.
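
The loss of information in a 'most likely category' scheme is easy to demonstrate. The sketch below is illustrative only: the function name is hypothetical, and the colour given to category N is an assumption, since the text specifies only red for B and blue for A.

```python
def most_likely_category(p):
    """Return the label of the category with the largest forecast probability."""
    labels = ("B", "N", "A")
    return labels[max(range(3), key=lambda i: p[i])]

# Red for B and blue for A follow the text; the colour for N is an assumption.
COLOUR = {"B": "red", "N": "yellow", "A": "blue"}

certain = (1.0, 0.0, 0.0)
near_climatology = (0.34, 0.33, 0.33)
colour_certain = COLOUR[most_likely_category(certain)]
colour_near = COLOUR[most_likely_category(near_climatology)]
```

Both forecasts receive the same colour, so the map cannot distinguish a confident forecast from one that is barely different from climatology.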

Since scores are squares of distances within the triangle, it is helpful to consider
the root-score $\sqrt{S}$. Figure 2c shows contours of constant root-score in the case
of the Brier score, when the observation o takes each of its three possible values.
These contours show the set of forecasts which are equally good (in terms of score)
as predictions of the subsequently observed value. Figure 2d shows contours of
constant root-score in the case of the Ranked Probability Score, for which the
induced triangle is a right-angled triangle with sides $1/\sqrt{2}$, $1/\sqrt{2}$ and 1 (Appendix,
section B).



(b) Current colour schemes for ternary forecasts

Current methods for assigning colours to ternary forecasts discretise barycentric
coordinates into a finite number of regions, meaning that information present
in the forecast is not present in the assigned colour (Slingsby et al. 2009). For
example, consider the 'most likely tercile' scheme of Figure 2b used for displaying
precipitation forecasts produced in the EURO-BRazilian Initiative for improving
South American seasonal forecasts (EUROBRISA) project†. Precipitation forecasts
produced in EUROBRISA are Gaussian distributions obtained from a calibration
and combination procedure known as forecast assimilation (Stephenson et al.
2005a). These 'integrated' forecasts are a combination of four coupled
ocean-atmosphere general circulation models (ECMWF (Anderson et al. 2007), UK Met
Office (Graham et al. 2005), Meteo France (Gueremy et al. 2005) and CPTEC
(Nobre et al. 2009)) and an empirical model that uses Pacific and Atlantic sea
surface temperatures as predictors for South American precipitation (Coelho et al.).

† http://eurobrisa.cptec.inpe.br


A feature of current visualisation methods is that the algorithms by which 
colours are assigned tend to be described algebraically. In the example considered 
here, colours are assigned to forecasts based on the following algebraic definitions 
of 5 regions of forecast space: 

1 (Dry): (p B > § and p N < \ and p A < -j)- 

2 (Dry or normal): (ps > \ and pn > §) or (ps > § and pn > j)- 

3 (Normal): {pb < | and pn > § and pa < |)- 

4 (Wet or normal): (p^ > | and pa > f ) or (p^ > § and pa > |)- 

5 (Wet): (pb < § and pn <\ and pa > §)■ 

The use of barycentric coordinates allows the meaning of these definitions to
be visualised easily (Figure 2). It also reveals something that is not immediately
obvious from the algebra. There is a region in barycentric coordinates (here coloured
grey) that is not included in these definitions. This region of ternary forecast space
- at the base of the triangle - corresponds to ternary forecasts in which category
N is assigned a low probability, but the outlying categories B and A are assigned
relatively high probability. Such forecasts can arise when the forecast distribution
F(x) has variance much greater than that of the climatology G(x) (§3e).

In summary, many current methods for assigning colours to forecasts lose infor- 
mation by discretising ternary forecast space. The same colour is assigned to more 
than one ternary forecast and the colours do not convey a sense of how much the 
forecast differs from climatology. A ternary forecast close to climatology provides 
little gain in information - the forecast has not told us much that we did not already 
know. On the other hand, a ternary forecast far from climatology provides a large 
gain in information. In such a case the forecasting system assigns high probability 
to the variable lying in one particular category and not the others. 

The aim now, therefore, is to assign a continuum of colours to ternary forecasts, 
viewed as points in barycentric coordinates. This will retain all information present 
in the ternary forecast and so it will be possible to identify a ternary forecast 
uniquely from its assigned colour. The choice of colour assignment will, in particular, 
take account of the information gain in each ternary forecast. 



(c) Comparing forecasts with the climatology 



Probabilistic forecasts are regarded here as measures of belief and so a measure
based on entropy (Jaynes 2003) is preferred. A sensible measure of information gain
E(p;q) is:

$$E(p;q) = \frac{1}{\log \max_{i \in \{B,N,A\}} (1/q_i)} \sum_{i \in \{B,N,A\}} p_i \log \frac{p_i}{q_i} \qquad (3.3)$$






[Figure 4 panels: E(p;q) [dashed], $\theta(p;q)$ [solid]; (a) q' = (0.33, 0.33, 0.33), (b) q' = (0.10, 0.20, 0.70).]

Figure 4. (a) Proposed coordinate system in the case when $q' = (\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})$. Information gain
E(p;q) (equation 3.3) [dashed lines] ranges from 0 (at the centre) to 1 (at the corners).
Dominant category $\theta(p;q)/2\pi$ [solid lines] ranges from 0 to 1 moving clockwise from thick
line. (b) as (a) but with q' = (0.1, 0.2, 0.7).

This can be interpreted as a scaled version of the Kullback-Leibler divergence
between p and q. Contours of constant E(p;q) are plotted in the unit triangle
in Figure 3. Note that the climatology q corresponds to an information gain of
0 (E(q;q) = 0) and that the corner furthest from the climatology corresponds to an
information gain of 1 (E(o;q) = 1 for some o).

Motivated by the idea of assigning colours according to the most likely tercile,
the continuous angular measure $\theta(p;q)$ will be referred to as the dominant category.
This measures the angle in barycentric coordinates of the forecast p with respect
to an origin at the climatology q. It follows that the information gain E(p;q) and
the dominant category $\theta(p;q)$ define an alternative coordinate system for ternary
forecast space (Figure 4).
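
These two coordinates can be sketched numerically. The fragment below assumes that the information gain is the Kullback-Leibler divergence of p from q divided by its largest value at a corner of the triangle, which is the scaling implied by the properties E(q;q) = 0 and E(o;q) = 1 stated above. The corner positions and the convention for the angle (measured anticlockwise from the horizontal axis using `atan2`) are assumptions; the reference direction and sense of rotation used in the figures may differ.

```python
import math

# Assumed corner positions (B bottom-left, N apex, A bottom-right).
CORNERS = ((0.0, 0.0), (0.5, math.sqrt(3) / 2), (1.0, 0.0))

def to_point(p):
    """Map a ternary distribution to a point in the unit equilateral triangle."""
    return (sum(w * c[0] for w, c in zip(p, CORNERS)),
            sum(w * c[1] for w, c in zip(p, CORNERS)))

def information_gain(p, q):
    """Kullback-Leibler divergence of p from q, scaled to lie in [0, 1]."""
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl / math.log(max(1.0 / qi for qi in q))

def dominant_category(p, q):
    """Angle in [0, 2*pi) of P about an origin at Q (assumed convention)."""
    (px, py), (qx, qy) = to_point(p), to_point(q)
    return math.atan2(py - qy, px - qx) % (2 * math.pi)

q_uniform = (1 / 3, 1 / 3, 1 / 3)
q_skewed = (0.1, 0.2, 0.7)
gain_at_climatology = information_gain(q_skewed, q_skewed)
gain_far_corner = information_gain((1.0, 0.0, 0.0), q_skewed)
gain_near_corner = information_gain((0.0, 0.0, 1.0), q_skewed)
angle_towards_n = dominant_category((0.0, 1.0, 0.0), q_uniform)
```

For the skewed climatology only the corner furthest from q (category B here) attains an information gain of 1; the other corners give smaller values.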

(d) A new colour scheme for ternary forecasts 

In the red-green-blue (RGB) representation of colour (used in colour
televisions), colours are represented by sets of three numbers in the range [0,1]
corresponding to the brightness of the three primary colours. Thus RGB = (1,0,0) is
bright red, RGB = (0,1,0) is bright green, and so on. In this paper, colours are
assigned using the more intuitive hue-saturation-value (HSV) representation.
Geometrically, the RGB system defines a unit cube in colour space in which shades of
grey occur on a grey-line running from the black corner RGB = (0,0,0) to the white
corner RGB = (1,1,1). The HSV system describes colours as points in a cylindrical
coordinate system with this grey-line as its axis. In this system, value $\in [0,1]$ is
a measure of distance parallel to the grey-line axis, saturation $\in [0,1]$ is a measure
of distance perpendicular to the axis and hue $\in [0,1]$ is an angular measure around
this axis.

Figure 5. The functions of equation 3.4. (a) Hue as a function of dominant category $\theta(p;q)$.
(b) Saturation as a function of information gain E(p;q). $m = 0.7$ and $\theta_0 = 0$ are used
throughout this paper.

Here, a colour is assigned to the forecast p by associating the dominant category
$\theta(p;q)$ with the hue and the information gain E(p;q) with the saturation. The
proposed algorithm for colour assignment is

$$\text{hue} = h\big([(\theta(p;q) - \theta_0) \bmod 2\pi]/2\pi\big)$$
$$\text{saturation} = (E(p;q))^m \qquad (3.4)$$
$$\text{value} = 1$$



where the functional forms $h(\theta)$ and $s(E) = E^m$ are illustrated in Figure 5.
Suggested default choices for the parameters are $m = 0.7$ and $\theta_0 = 0$, which together
yield the colour palette shown in Figure 6. Note that the climatology is always
assigned the colour white by this system. The nonlinear hue function $h(\theta)$ has been
chosen to minimise the region of barycentric coordinates that is assigned the colour
green and hence to minimise difficulty for readers with green-weak colour blindness
(Light & Bartlein 2004; Stephenson 2005b). The parameter $\theta_0$ can be chosen in
order to rotate the palette in the triangle about the climatology, while the exponent
m controls the rate at which colour saturation changes away from the climatology.
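
Equation 3.4 translates directly into code once the dominant category and the information gain are available. The sketch below uses the standard-library `colorsys` module to convert HSV to RGB. The nonlinear hue function h is shown only graphically in Figure 5, so the identity function used as the default `hue_fn` here is a placeholder assumption and will not reproduce the paper's palette exactly.

```python
import colorsys
import math

def assign_colour(theta, gain, m=0.7, theta0=0.0, hue_fn=lambda u: u):
    """Return an RGB triple for a forecast with angle theta and information gain.

    hue_fn stands in for the paper's nonlinear hue function h; the identity
    default is a placeholder assumption.
    """
    u = ((theta - theta0) % (2 * math.pi)) / (2 * math.pi)  # rescale angle to [0, 1)
    hue = hue_fn(u)
    saturation = gain ** m
    return colorsys.hsv_to_rgb(hue, saturation, 1.0)        # value is always 1

white = assign_colour(theta=1.0, gain=0.0)      # forecast equal to climatology
saturated = assign_colour(theta=0.0, gain=1.0)  # fully saturated colour at hue 0
```

Because the saturation vanishes when the information gain is zero, the climatology is white whatever its angle, as noted in the text.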

An example of a forecast map using the proposed colour scheme is shown in
Figure 7. In this illustration, the data consist of integrated seasonal precipitation
forecasts for South America produced by the EUROBRISA project. At each spatial
location, the probabilistic forecast is compared with the terciles of the local
climatology. In other words, the ternary climatology $q' = (\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})$ (shown by a
cross in the colour palette bottom-right) has been used as the benchmark ternary
forecast. The map shows clearly that high probability is assigned to low rainfall in
northern Brazil (red) while there are regions in Southeast Brazil and the Southern
Ocean for which the forecast barely differs from climatology (white). There are also
regions where rainfall in the near-normal category is forecast with high probability
(yellow) and low probability (purple).
[Figure 6 panels: 'Palette when $\theta_0 = 0°$, m = 0.7'; q' = (0.33, 0.33, 0.33).]

Figure 6. (a) Palette of colours assigned to ternary forecasts (equation 3.4). Climatology
q indicated by blue cross. (b) as (a) but with barplots indicating ternary forecasts p.



(e) Special case: Gaussian distributions



To gain insight into the colours assigned by equation 3.4 it is helpful to consider
the special case in which the climatology G(x) and the forecast F(x) are both
Gaussian distributions. The space of Gaussian distributions is two-dimensional
(each distribution defined by its mean and variance) and so there is a natural
one-to-one mapping between a (Gaussian) forecast F(x) and its ternary representation
p.

Specifically, suppose that the climatology is $N(\mu_c, \sigma_c^2)$ and the forecast is $N(\mu, \sigma^2)$.
It follows that

$$G(x) = \Phi\left(\frac{x - \mu_c}{\sigma_c}\right); \qquad F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right) \qquad (3.5)$$

where $\Phi(z)$ is the CDF of the standard normal distribution N(0,1). It is helpful
to normalise the forecast mean and standard deviation with respect to those of the
climatology

$$\hat{\mu} = \frac{\mu - \mu_c}{\sigma_c}; \qquad \hat{\sigma} = \frac{\sigma}{\sigma_c} \qquad (3.6)$$

It follows that the forecast is more sharply peaked than the climatology when $\hat{\sigma} < 1$
and less sharply peaked than the climatology when $\hat{\sigma} > 1$. The ternary forecast p
can be calculated from equation 2.1 by noting that

$$F(x_B) = \Phi\left(\frac{\Phi^{-1}(q_B) - \hat{\mu}}{\hat{\sigma}}\right); \qquad F(x_A) = \Phi\left(\frac{\Phi^{-1}(q_B + q_N) - \hat{\mu}}{\hat{\sigma}}\right) \qquad (3.7)$$
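
The Gaussian special case can be explored numerically by combining the category bounds of equation 2.2 with the scaled mean and standard deviation. The following sketch is illustrative only; the function name and the test values of the scaled parameters are assumptions chosen to show the limiting behaviours.

```python
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution N(0, 1)

def gaussian_ternary(mu_hat, sigma_hat, q=(1 / 3, 1 / 3, 1 / 3)):
    """Ternary forecast p for a Gaussian forecast with scaled mean and spread."""
    q_b, q_n, _ = q
    f_b = PHI.cdf((PHI.inv_cdf(q_b) - mu_hat) / sigma_hat)        # F(x_B)
    f_a = PHI.cdf((PHI.inv_cdf(q_b + q_n) - mu_hat) / sigma_hat)  # F(x_A)
    return (f_b, f_a - f_b, 1.0 - f_a)

same_as_climatology = gaussian_ternary(0.0, 1.0)  # recovers q
sharp_and_dry = gaussian_ternary(-2.0, 0.2)       # almost all mass in B
sharp_and_central = gaussian_ternary(0.0, 0.2)    # almost all mass in N
very_broad = gaussian_ternary(0.0, 5.0)           # mass split between B and A
```

A forecast identical to the climatology recovers q, a sharp forecast concentrates its mass in a single category, and a very broad forecast leaves little mass in the central category.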



Figure 8a illustrates the colours that are assigned to a forecast and climatology
when both are Gaussian. When the forecast is much more sharply peaked than
the climatology ($\hat{\sigma} \ll 1$) the ternary forecast assigns nearly all of its probability
mass to one of the three categories. When the forecast mean is much less than
the climatological mean ($\hat{\mu} \ll 0$) this probability is assigned to category B and
the assigned colour is a strong red. Similarly a strong yellow is assigned when the
means are comparable ($\hat{\mu} \approx 0$) and a strong blue is assigned when the forecast mean
significantly exceeds the climatological mean ($\hat{\mu} \gg 0$).

[Figure 7 map title: 'Ternary precipitation forecast. Issued: Nov 1997. Valid: DJF 1997'.]

Figure 7. A probabilistic forecast map produced using the colour scheme proposed here
(equation 3.4). The continuous colour palette (bottom right) conveys the probabilities
assigned to 'Below-', 'Near-' and 'Above-' normal precipitation categories
(Figure 1). For example, colours close to white indicate a forecast similar to climatology,
strong red indicates a high probability of below-normal precipitation, and strong blue a
high probability of above-normal precipitation.

The situation is rather different when the forecast has much higher variance than
the climatology ($\hat{\sigma} \gg 1$). In this case the ternary forecast assigns little probability
mass to the central category N but rather splits it approximately evenly between 
the two extreme categories. In this case the assigned colour is purple. This sort 
of situation might arise in an ensemble climate forecast in which some members 
of the ensemble predict that the climate variable increases while other members 
predict that it decreases. The purple colour might also arise if the forecasting sys- 
tem predicted an increase in the frequency of extreme events in both tails of the 
distribution. 

Figure 8b shows $\hat{\mu}$ and $\hat{\sigma}$ in barycentric coordinates, and hence the Gaussian
distributions of climatology and forecast that would yield a given ternary forecast p.
Consider the case in which the forecast and the climatology have similar variances,
as appears often to be the case. It follows that $\hat{\sigma} \approx 1$ and so the ternary forecast
p lies close to the contour $\hat{\sigma} = 1$ in Figure 8b. The forecast mean $\hat{\mu}$ then controls
the colour assigned to the forecast, with red assigned when $\hat{\mu} \ll 0$, white assigned
when $\hat{\mu} \approx 0$ and blue assigned when $\hat{\mu} \gg 0$.

[Figure 8 panels: (a) 'Colours when $F(x) \sim N(\mu, \sigma^2)$, $G(x) \sim N(\mu_c, \sigma_c^2)$', horizontal axis 'scaled mean: $\hat{\mu} = (\mu - \mu_c)/\sigma_c$'; (b) '$\hat{\mu}(p;q)$ [dashed], $\hat{\sigma}(p;q)$ [solid]', q' = (0.33, 0.33, 0.33).]

Figure 8. (a) Colours assigned when forecast F(x) and climatology G(x) are both Gaussian
distributions. The case where the forecast equals the climatology is indicated by a cross.
(b) Contours of scaled mean $\hat{\mu} \in \{0, \pm 0.2, \pm 0.5, \pm 1, \pm 2, \pm 5\}$ and scaled standard deviation
$\hat{\sigma} \in \{0.2, 0.5, 1, 2, 5\}$ in barycentric coordinates (equation 3.6).



4. Verification of ternary probabilistic forecasts 



The previous sections have considered the visualisation of a set of probabilistic
forecasts. The aim in this section is to visualise the difference between a set of forecasts
and the corresponding set of observations. This is known as forecast verification
(Jolliffe & Stephenson 2003). In the next section the standard score decomposition
of Murphy (1973) is reviewed and re-interpreted geometrically.



(a) Decomposition of the score 



Suppose that the scoring matrix L has been specified so that the score of an
individual forecast-observation pair is given by a squared distance in the appropriate
triangle (equation 3.2). As in previous sections, the Brier score is the default choice
but the results hold for any quadratic scoring function.

The triangle representing ternary forecast space can be discretised into a finite
number of bins with centres $P_k$. For simplicity, each forecast P that lies in the bin
with centre $P_k$ will be reassigned the central value $P_k$ (Doblas-Reyes et al. 2008).
Following Murphy (1973), consider the identity

$$P_k - O = (P_k - \overline{O|P_k}) - (Q - \overline{O|P_k}) + (Q - O) \qquad (4.1)$$
where $\overline{O|P_k}$ is the mean observation associated with forecasts in bin $P_k$. If the
climatology is defined to be the mean of all the observations, $Q = \overline{O}$, it follows that
a decomposition of the mean score with binned forecasts is:

$$\underbrace{\overline{\lVert P_k - O \rVert^2}}_{\text{score}} = \underbrace{\overline{\lVert Q - O \rVert^2}}_{\text{uncertainty}} - \underbrace{\overline{\lVert Q - \overline{O|P_k} \rVert^2}}_{\text{resolution}} + \underbrace{\overline{\lVert P_k - \overline{O|P_k} \rVert^2}}_{\text{reliability}} \qquad (4.2)$$

$$S = U - Z + R$$

Equation 4.2 constitutes the standard decomposition of the score S into uncertainty
U, reliability R and resolution Z (Jolliffe & Stephenson 2003). It is important to
stress that this decomposition applies only when all forecasts within a bin are
assigned the central value (Stephenson et al.).
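
The decomposition can be checked numerically for a small set of binned forecasts. The sketch below is illustrative and works directly with probability vectors, taking the squared distance to be half the sum of squared differences (the value that corresponds to distance in a unit equilateral triangle); that normalisation, the function names and the eight forecast-observation pairs are assumptions made for the example.

```python
from collections import defaultdict

def sqdist(a, b):
    """Half the sum of squared differences (assumed Brier-type normalisation)."""
    return 0.5 * sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    """Component-wise mean of a list of 3-vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(3))

def decompose(binned_forecasts, observations):
    """Return (S, U, Z, R) for forecasts already set equal to their bin centres."""
    n = len(observations)
    q = mean_vector(observations)        # climatology = mean of all observations
    by_bin = defaultdict(list)           # observations grouped by bin centre P_k
    for p_k, o in zip(binned_forecasts, observations):
        by_bin[p_k].append(o)
    score = sum(sqdist(p_k, o) for p_k, o in zip(binned_forecasts, observations)) / n
    uncertainty = sum(sqdist(q, o) for o in observations) / n
    resolution = sum(len(obs) * sqdist(q, mean_vector(obs))
                     for obs in by_bin.values()) / n
    reliability = sum(len(obs) * sqdist(p_k, mean_vector(obs))
                      for p_k, obs in by_bin.items()) / n
    return score, uncertainty, resolution, reliability

# Hypothetical data: two bins with four forecast-observation pairs in each.
B, N, A = (1, 0, 0), (0, 1, 0), (0, 0, 1)
forecasts = [(0.6, 0.3, 0.1)] * 4 + [(0.2, 0.3, 0.5)] * 4
observations = [B, B, B, N, A, A, N, B]
S, U, Z, R = decompose(forecasts, observations)
```

The identity S = U - Z + R holds to rounding error here because every forecast in a bin takes the same value and the climatology is the mean observation; it would not hold exactly if the forecasts varied within a bin.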



It has already been remarked that the score is the mean square distance in the
triangle between forecasts and observations, and it is clear that the uncertainty is
the mean square distance between observations and climatology. In order to see how
uncertainty varies with climatology q (Figure 9), it is helpful (Appendix, section C)
to consider the particular climatology $q_0$ (shown by a dot in Figure 9) for which the
uncertainty gains its maximum value $U_0 = U(q_0)$. The uncertainty U(q) is then

$$U = U_0 - \Delta U \qquad (4.3)$$

where $\Delta U(q)$ represents the reduction in uncertainty between $q_0$ and q. Figure
9 illustrates that the root-uncertainty-reduction $\sqrt{\Delta U}$ can be visualised as the
distance in the triangle between the climatology of interest q and the climatology
$q_0$ of maximum uncertainty.
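
For the Brier score this geometrical statement can be verified directly. The sketch below assumes the half-sum-of-squares normalisation that corresponds to a unit equilateral triangle, for which the climatology of maximum uncertainty is the uniform distribution; the general result for other scoring rules is given in the Appendix and is not reproduced here. The function names and the example climatology are assumptions.

```python
def sqdist(a, b):
    """Half the sum of squared differences (assumed Brier-type normalisation)."""
    return 0.5 * sum((x - y) ** 2 for x, y in zip(a, b))

def uncertainty(q):
    """Mean squared distance between climatology q and observations drawn from q."""
    corners = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    return sum(q_i * sqdist(q, o) for q_i, o in zip(q, corners))

q0 = (1 / 3, 1 / 3, 1 / 3)   # climatology of maximum uncertainty (Brier score)
q = (0.1, 0.2, 0.7)          # hypothetical climatology of interest
u0 = uncertainty(q0)
delta_u = u0 - uncertainty(q)        # reduction in uncertainty
distance_squared = sqdist(q, q0)     # squared distance between Q and Q_0
```

The two quantities agree, so the square root of the uncertainty reduction is the distance between the two climatologies in the triangle.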

Similarly, the resolution in equation 4.2 is the mean square distance between
climatology Q and the mean observations conditional on the forecasts $\overline{O|P_k}$. The
reliability is the mean square distance between forecasts $P_k$ and the mean
observations conditional on the forecasts $\overline{O|P_k}$.

It follows that the root-score $\sqrt{S}$ is the root-mean-square (rms) distance
between forecasts and observations, the root-uncertainty $\sqrt{U}$ is the rms distance
between climatology and observations, the root-resolution $\sqrt{Z}$ is the rms distance
between climatology and mean conditional observations and the root-reliability $\sqrt{R}$
is the rms distance between forecasts and mean conditional observations. These
geometrical interpretations as rms distances are illustrated in the triangle in Figure 10.

Figure 10. Geometrical interpretation of the decomposition of a quadratic score S into
uncertainty U, resolution Z and reliability R (equation 4.2), whose square-roots represent
distances in the triangle. A proposed 'decomposition diagram' is shown at top-left.

The aim of verification is to quantify the root-score \J~S (Figure [TU1 black lines) 
and the aim of recalibration is to minimise the root-score. Since the forecaster 
cannot change the observations (Figure [TUJ red circles), recalibration proceeds by 
changing the forecasts Pfc (Figure HU1 black circles) in order to minimise the root- 
reliability VH (Figure [TU1 red lines). 

The diagram at the top left of Figure [TU] contains a proposed graphical inter- 
pretation of the score decomposition. The root-uncertainty y/U is independent of 
the forecasts and defines a semi-circle (grey) of diameter \/\J (blue). The root- 
resolution \[Z is also independent of the value of the forecasts (provided the 
'binning' of the forecasts remains unchanged) and defines the larger of the two 
right-angled triangles. This triangle has sides of length \/Tj (blue), \[Z (green) and 
\JU — Z (purple). It follows from Pythagoras' theorem that all three corners of this 
triangle lie on the semi-circle, with two on the diameter. The smaller of the two 
right-angled triangles contains components of the score decomposition that can be 
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°->8 S 



Pk-0|Pk 




q'=(0.33, 0.33, 0.33) 
threshold = 10 5p=1/11 



threshold = 10 5p = 1/1 



Figure 11. Examples of proposed 'ternary reliability diagrams'. (a) A set of poorly
calibrated 'original' forecasts has poor reliability (long red dipoles). (b) Recalibration of
these forecasts by 'moving the black dots' improves reliability (shorter red dipoles). The
decomposition diagram (top left) shows that the recalibrated score is close to the best
achievable by recalibration. The sharpness diagram (top right) illustrates the recalibration
geometrically.

altered by recalibration of the forecasts. The smaller triangle has sides of length $\sqrt{S}$
(black), $\sqrt{R}$ (red) and $\sqrt{U - Z}$ (purple) and equation 4.2 constitutes Pythagoras'
theorem. The aim of recalibration is to minimise the root-score $\sqrt{S}$ by minimising
the root-reliability $\sqrt{R}$. It follows that the dashed lines in the decomposition
diagram represent limiting values of the root-score $\sqrt{S}$. A forecasting system that
always issued the climatology as its forecast would have resolution $Z = 0$ and hence
have a root-score $\sqrt{S}$ indicated by the blue dashed line. This represents the worst
performance that might be expected from a skilful forecasting system. On the other
hand, the best possible performance is indicated by the purple dashed line. This
would occur for a forecasting system with perfect root-reliability $\sqrt{R} = 0$.
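
The lengths that make up the decomposition diagram can be sketched numerically. The
following minimal Python sketch is an illustration and not the authors' R code. It
assumes the decomposition $S = U - Z + R$ implied by the two right-angled triangles;
the function name `decomposition_diagram` and the coordinate layout (diameter placed
along the x-axis, starting at the origin) are assumptions made here, since the text
does not fix the orientation of the diagram.

```python
import math

def decomposition_diagram(U, Z, R):
    """Lengths and one vertex of the decomposition diagram for S = U - Z + R."""
    if U <= 0.0 or not (0.0 <= Z <= U) or R < 0.0:
        raise ValueError("expected U > 0, 0 <= Z <= U and R >= 0")
    d = math.sqrt(U)  # diameter of the semi-circle (root-uncertainty)
    # Assumed layout: the diameter runs from (0, 0) to (d, 0). The apex of the
    # larger right-angled triangle lies at distance sqrt(U - Z) from (0, 0)
    # and at distance sqrt(Z) from (d, 0).
    apex = ((U - Z) / d, math.sqrt(Z * (U - Z)) / d)
    return {
        "root_score": math.sqrt(U - Z + R),
        "worst_root_score": d,                # climatology issued as forecast
        "best_root_score": math.sqrt(U - Z),  # perfect reliability, R = 0
        "apex": apex,
    }

# Illustrative (made-up) values chosen so that both triangles have integer sides.
diagram = decomposition_diagram(U=25.0, Z=16.0, R=7.0)
print(diagram["best_root_score"], diagram["root_score"], diagram["worst_root_score"])
```

In this sketch the apex always lies on the semi-circle of diameter $\sqrt{U}$, which
is the property used in the text to draw the larger triangle.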



(b) Ternary reliability diagrams

The ideas of the preceding section can now be illustrated with an example. The
data in this example are seasonal precipitation hindcasts at one month lead-time
from the EUROBRISA integrated forecast model. The data relate to the years 1981
to 2005, the season January-February-March (JFM) and a spatial region 72.5°W to
42.5°W, 12.5°S to 2.5°N for which seasonal forecasts show reasonable skill.

Figure 11a contains one of our proposed ternary reliability diagrams. The large
triangle contains information on the reliability of the forecasts. Bins of size
$\delta p = 1/11$ were chosen and the central forecast in each bin $P_k$, to which all
forecasts in that bin are set equal, is plotted in the triangle as a black circle. The
mean observation conditional on a forecast, $O|P_k$, is plotted as a red circle and the
two are joined by a red line in order to form a 'dipole'. The length of each dipole
represents the root-reliability of the forecasts in each bin. The number of forecasts in
each bin is shown graphically by the ternary sharpness diagram at the top-right (dark
colours: high density; light colours: low density; grey: no data). The climatology is
shown by a blue cross. Dipoles for bins containing fewer than 10 forecasts (indicated in
the diagram by the text 'threshold = 10') are omitted for clarity. An advantage of this
sort of visualisation of the data is that coherent patterns in the reliability dipoles
are immediately obvious. Such patterns would indicate regions of ternary forecast
space in which the forecasting system consistently assigned the wrong probability
to one of the three categories.
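
The construction of the dipoles and of the score decomposition can be sketched as
follows. This is a minimal Python illustration under stated assumptions, not the
authors' R implementation: forecasts are assumed to have already been set equal to the
central forecast of their bin, observations are coded as unit vectors in the order
(B, N, A), the Brier Score is used as the quadratic score, and the function names and
the toy data are hypothetical.

```python
from collections import defaultdict

def brier(p, o):
    """Ternary Brier Score: half the squared Euclidean distance between p and o."""
    return 0.5 * sum((pi - oi) ** 2 for pi, oi in zip(p, o))

def mean_vector(vectors):
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(3))

def reliability_dipoles(forecasts, observations, threshold=1):
    """Return S, U, Z, R and the (forecast, mean observation, count) dipoles."""
    groups = defaultdict(list)
    for p, o in zip(forecasts, observations):
        groups[tuple(p)].append(tuple(o))       # one group per binned forecast
    n = len(forecasts)
    q = mean_vector([tuple(o) for o in observations])   # observed climatology
    U = sum(brier(q, o) for o in observations) / n
    Z = R = 0.0
    dipoles = []
    for p_k, obs in groups.items():
        o_k = mean_vector(obs)                  # mean observation given forecast p_k
        weight = len(obs) / n
        Z += weight * brier(q, o_k)             # climatology to mean observation
        R += weight * brier(p_k, o_k)           # forecast to mean observation
        if len(obs) >= threshold:               # sparse bins are not drawn
            dipoles.append((p_k, o_k, len(obs)))
    S = sum(brier(p, o) for p, o in zip(forecasts, observations)) / n
    return {"S": S, "U": U, "Z": Z, "R": R, "dipoles": dipoles, "climatology": q}

# Toy data (made up for illustration only).
B, N, A = (1, 0, 0), (0, 1, 0), (0, 0, 1)
forecasts = [(0.6, 0.3, 0.1)] * 4 + [(0.2, 0.3, 0.5)] * 4 + [(0.3, 0.4, 0.3)] * 2
observations = [B, B, N, A, A, A, N, B, N, A]
result = reliability_dipoles(forecasts, observations, threshold=3)
print(result["S"], result["U"] - result["Z"] + result["R"])  # equal up to rounding
```

With forecasts grouped in this way, the score of the binned forecasts equals
$U - Z + R$ up to floating-point rounding, and the length of each dipole is the square
root of `brier(p_k, o_k)`.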

The decomposition diagram (top-left) illustrates geometrically the decomposition
of the root-score in this dataset. Thus, the root-score $\sqrt{S} = 0.569$ is seen to be
rather poor compared to the best-possible root-score $\sqrt{U - Z} = 0.547$ that could
be attained through recalibration.



(c) Forecast maps including verification data

The map of Figure 7 contains information about forecasts but not about the skill
of the forecasting system. In order to incorporate verification data into a forecast
map, the array of coloured squares can be replaced by an array of coloured circles
whose radii are a measure of forecast skill in a verification data set. One possibility
would be to set radius proportional to $(Z - R)/Z$, which is a 'standard' definition
of skill. In this paper, however, the convention has been to consider quantities like
$\sqrt{Z}$ and $\sqrt{R}$, which can be interpreted as distances in a triangle rather than squared
distances (Figure 10), and so $(\sqrt{Z} - \sqrt{R})/\sqrt{Z}$ is preferred here. (It is straightforward
to relate these two possibilities via the identity
$(Z - R)/Z = 1 - (1 - (\sqrt{Z} - \sqrt{R})/\sqrt{Z})^2$.) Making the choice:

$$\text{circle radius} \propto \frac{\sqrt{Z} - \sqrt{R}}{\sqrt{Z}} \tag{4.4}$$

it follows that larger circles will be plotted in regions of high skill, and no circle
will be plotted if the forecasting system has less skill than a climatological forecast
(i.e. when the assigned radius is negative). A map produced in this way is shown
in Figure 12. This map shows clearly that, for the season shown, the skill of this
forecasting system is greatest near the northern coast of South America. Maps of
this type should prove useful in communication of operational forecasts because
the reader's eye is drawn to areas where the forecasting system has been shown to
perform well.

5. Recalibration of ternary probabilistic forecasts

The decomposition diagram in Figure 11 illustrates that significant improvement
can sometimes be made to the mean score by recalibrating the forecasts. The score
of the original forecasts is $\|P - O\|^2$ and so the aim is to produce recalibrated
forecasts $\tilde{p}$, calculated from the original forecasts via some specified functional form,
whose score $\|\tilde{P} - O\|^2$ is minimised. As an example, consider recalibrated ternary
forecasts which are a quadratic function of the original ternary forecasts (Appendix,
section d). Given a dataset of forecasts and observations, standard numerical
optimisation techniques can be used to find coefficients in the quadratic function which
minimise the score of the recalibrated forecasts. The recalibration function can
be interpreted by comparing Figure 11a and Figure 11b. Recalibration minimises
the score by changing the reliability (that is, the root-mean-square length of the
[Figure 12 map title: Ternary precipitation forecast. Issued: Nov 1997. Valid: DJF 1997.]

Figure 12. Probabilistic forecast map including verification data. Larger circles correspond
to the forecasting system performing well historically. No circle is plotted where the
forecasting system has performed worse than a climatological forecast. Colours are assigned
as in Figure 7 and circle radius via equation 4.4.



red dipoles). It is important to note that recalibration changes the forecasts (black
dots) but not the observations (red circles) and so the rms length of the red dipoles
has been reduced from $\sqrt{R} = 0.159$ to $\sqrt{R} = 0.092$ solely by moving the position
of the black dots. The effect of this reduction can be seen in the decomposition
diagrams (top-left). Recalibration changes neither the root-uncertainty $\sqrt{U} = 0.577$
nor the root-resolution $\sqrt{Z} = 0.185$, but reduces the root-score from $\sqrt{S} = 0.569$
to $\sqrt{S} = 0.554$.
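
The values quoted above are mutually consistent under the decomposition
$S = U - Z + R$, as the following short Python check illustrates. The function names
are hypothetical. The second function evaluates the skill measure of equation 4.4 with
negative values clipped to zero (no circle plotted). Applying it to these domain-wide
values is only an illustration, since the map of Figure 12 uses a separate value at
each location.

```python
import math

def root_score(root_U, root_Z, root_R):
    """sqrt(S) from S = U - Z + R, given the three root quantities."""
    return math.sqrt(root_U ** 2 - root_Z ** 2 + root_R ** 2)

def skill_radius(root_Z, root_R):
    """(sqrt(Z) - sqrt(R)) / sqrt(Z), with negative skill clipped to zero."""
    return max(0.0, (root_Z - root_R) / root_Z)

ROOT_U, ROOT_Z = 0.577, 0.185            # values quoted in the text
original = root_score(ROOT_U, ROOT_Z, 0.159)
recalibrated = root_score(ROOT_U, ROOT_Z, 0.092)
best = root_score(ROOT_U, ROOT_Z, 0.0)   # perfect reliability

print(f"{original:.3f} {recalibrated:.3f} {best:.3f}")   # 0.569 0.554 0.547
print(skill_radius(ROOT_Z, 0.159), skill_radius(ROOT_Z, 0.092))
```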

In this case it is immediately clear that the recalibrated forecasts (Figure 11b,
black dots) are clustered around an arc passing close to the climatology (blue cross).
It is interesting that this arc is similar to the contour $\sigma = 1$ in Figure [BJd which
applies when the climatology and the forecast are both Gaussian with equal variance.

Finally, the sharpness diagram (Figure 11b, top-right) shows the distribution
of the recalibrated forecasts. It is produced by passing the sharpness diagram of
the original forecasts (Figure 11a, top-right) through the quadratic recalibration
function (Appendix, section d). Regions of the sharpness diagram coloured grey
indicate possible forecasts that were not originally issued. Regions coloured white
indicate forecasts that can never be issued under this recalibration.



6. Conclusions 

This article has outlined a novel procedure for visualising ternary probabilistic 
forecasts. The proposed maps convey through colour all of the information present 
in a ternary forecast. A colour palette printed next to the map can be used to 
deduce the ternary forecasts directly from the colour used. 

The proposed colour scheme always assigns the colour white to a forecast close
to climatology, and strong colours to forecasts that differ greatly from climatology.
The examples considered in this paper have concerned precipitation and so
the 'wet' colour blue has been assigned to the above normal category A and the
'dry' colour red to the below normal category B. It is straightforward to reverse
these conventions if desired for other forecast variables (e.g. temperature).
The default palette (with $\theta_0 = 0$) assigns strong red, yellow and blue respectively
to forecasts that assign high probability to one of the three categories. Yellow has
been chosen in preference to green in order to assist colour blind readers. A variety
of palettes can be created by varying the parameters $m$ and $\theta_0$ in equation 3.4.

A novel visual interpretation of ternary forecasts has also been suggested for 
verification and recalibration under quadratic scoring functions. It has been shown 
that Brier scores correspond to squared distances in an equilateral triangle and 
Ranked Probability Scores correspond to squared distances in a right-angled trian- 
gle. Thus, the root-score, root-uncertainty, root-resolution and root-reliability can 
all be interpreted as root-mean-square distances. Verification data can be visualised
using the proposed ternary reliability diagrams (e.g. Figure 11), which incorporate
a decomposition diagram (top-left) and a sharpness diagram (top-right) to aid 
interpretation. The geometrical interpretation can also be applied to nonlinear re- 
calibration of probabilistic forecasts. 

It is hoped that the procedures outlined here will prove useful in the interpre- 
tation and communication of operational weather and climate forecasts. When no 
information on skill is available, a map like that of Figure 7 can be produced. If skill
information is available, it can be incorporated as in Figure 12.
Appendix A.

Let the triangle in $\mathbb{R}^2$ induced by the matrix $L$ have vertices $BNA$ (Figure 13) and
sides of length $b$, $n$ and $a$:

$$b = \sqrt{S(o_N; o_A)}, \qquad n = \sqrt{S(o_A; o_B)}, \qquad a = \sqrt{S(o_B; o_N)} \tag{A 1}$$

From the cosine rule, the angle $\phi$ is given by

$$\cos\phi = \frac{a^2 + n^2 - b^2}{2an} \tag{A 2}$$

It follows that a ternary forecast $p \in \mathbb{R}^3$ and the associated point $P \in \mathbb{R}^2$ within
the induced triangle are related by:

$$P = Mp; \qquad p = \widetilde{M}P + o_B \tag{A 3}$$

where the transformation matrices are defined by

$$M = \begin{pmatrix} 0 & a\cos\phi & n \\ 0 & a\sin\phi & 0 \end{pmatrix}, \qquad
\widetilde{M} = \frac{1}{an\sin\phi}\begin{pmatrix} -a\sin\phi & a\cos\phi - n \\ 0 & n \\ a\sin\phi & -a\cos\phi \end{pmatrix} \tag{A 4}$$
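
The two transformations can be sketched in a few lines of Python. This is an
illustrative sketch rather than the authors' R code: the function names are
hypothetical, forecasts are ordered as $(p_B, p_N, p_A)$, the vertex $B$ is placed at
the origin with $o_B = (1, 0, 0)$, and the unit side lengths in the usage example are
chosen purely for illustration.

```python
import math

def triangle_matrices(a, n, b):
    """Build the forward (2x3) and backward (3x2) matrices from the side lengths."""
    cos_phi = (a * a + n * n - b * b) / (2.0 * a * n)     # cosine rule
    sin_phi = math.sqrt(1.0 - cos_phi ** 2)
    forward = [[0.0, a * cos_phi, n],
               [0.0, a * sin_phi, 0.0]]
    k = a * n * sin_phi
    backward = [[-a * sin_phi / k, (a * cos_phi - n) / k],
                [0.0, n / k],
                [a * sin_phi / k, -a * cos_phi / k]]
    return forward, backward

def to_point(forward, p):
    """Map a ternary forecast p = (p_B, p_N, p_A) to a point in the triangle."""
    return tuple(sum(forward[i][j] * p[j] for j in range(3)) for i in range(2))

def to_forecast(backward, P, o_B=(1.0, 0.0, 0.0)):
    """Map a point in the triangle back to a ternary forecast."""
    return tuple(sum(backward[i][j] * P[j] for j in range(2)) + o_B[i]
                 for i in range(3))

forward, backward = triangle_matrices(a=1.0, n=1.0, b=1.0)   # equilateral example
p = (0.2, 0.5, 0.3)
P = to_point(forward, p)
print(P, to_forecast(backward, P))   # the second tuple recovers p up to rounding
```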



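(a) Brier Score

The Brier Score is defined by

$$S(p; o) = \tfrac{1}{2}\left[(p_B - o_B)^2 + (p_N - o_N)^2 + (p_A - o_A)^2\right]$$

It follows that, for the Brier Score,

$$L = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{A 5}$$

$$M = \begin{pmatrix} 0 & \tfrac{1}{2} & 1 \\ 0 & \tfrac{\sqrt{3}}{2} & 0 \end{pmatrix}, \qquad
\widetilde{M} = \begin{pmatrix} -1 & -\tfrac{1}{\sqrt{3}} \\ 0 & \tfrac{2}{\sqrt{3}} \\ 1 & -\tfrac{1}{\sqrt{3}} \end{pmatrix} \tag{A 6}$$

The triangle induced by the Brier Score is an equilateral triangle with unit sides
$b = n = a = 1$.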



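(b) Ranked Probability Score

The Ranked Probability Score (Epstein 1969; Murphy 1969, 1971) is defined by:

$$S(p; o) = \tfrac{1}{2}\left[(p_B - o_B)^2 + (p_B + p_N - o_B - o_N)^2\right]$$

It follows that, for the Ranked Probability Score,

$$L = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix} \tag{A 7}$$

$$M = \begin{pmatrix} 0 & \tfrac{1}{2} & 1 \\ 0 & \tfrac{1}{2} & 0 \end{pmatrix}, \qquad
\widetilde{M} = \begin{pmatrix} -1 & -1 \\ 0 & 2 \\ 1 & -1 \end{pmatrix} \tag{A 8}$$

The triangle induced by the Ranked Probability Score is a right-angled triangle
with sides $b = a = 1/\sqrt{2}$, $n = 1$.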
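
The side lengths quoted for the two scores can be checked directly from the matrices
$L$. The sketch below is a hedged Python illustration, not the authors' code: it
assumes that a quadratic score takes the form $S(p; o) = (p - o)'L'L(p - o)$, that
categories are ordered (B, N, A), and it uses hypothetical function names. It also
checks, for one example forecast, that the Ranked Probability Score equals the squared
distance between the mapped points in the right-angled triangle.

```python
import math

def score(L, p, o):
    """Quadratic score (p - o)' L'L (p - o), assumed form of the scoring rule."""
    d = [pi - oi for pi, oi in zip(p, o)]
    Ld = [sum(row[j] * d[j] for j in range(3)) for row in L]
    return sum(x * x for x in Ld)

B, N, A = (1, 0, 0), (0, 1, 0), (0, 0, 1)

def side_lengths(L):
    """Side lengths (b, n, a) of the induced triangle."""
    return (math.sqrt(score(L, N, A)),
            math.sqrt(score(L, A, B)),
            math.sqrt(score(L, B, N)))

r = 1.0 / math.sqrt(2.0)
L_BRIER = [[r, 0, 0], [0, r, 0], [0, 0, r]]
L_RPS = [[r, 0, 0], [r, r, 0], [r, r, r]]
M_RPS = [[0.0, 0.5, 1.0], [0.0, 0.5, 0.0]]

def squared_distance(M, p, o):
    """Squared Euclidean distance between the points Mp and Mo."""
    return sum(sum(M[i][j] * (p[j] - o[j]) for j in range(3)) ** 2
               for i in range(2))

print(side_lengths(L_BRIER))   # three equal sides
print(side_lengths(L_RPS))     # two short sides and one side of unit length
p = (0.2, 0.5, 0.3)
print(score(L_RPS, p, B), squared_distance(M_RPS, p, B))   # agree up to rounding
```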
(c) The uncertainty of a quadratic score

For a quadratic scoring rule (equation [XT]) the uncertainty $U(q)$ is equal to the
expected score when the climatology $q$ is issued as the forecast. It follows that

$$U(q) = v'q - q'L'Lq; \qquad \text{where } v = \operatorname{diag}(L'L) \tag{A 9}$$

It follows that

$$U(q) = \tfrac{1}{2}\hat{q}'v - (q - \hat{q})'L'L(q - \hat{q}); \qquad \text{where } \hat{q} = \tfrac{1}{2}(L'L)^{-1}v \tag{A 10}$$

In the particular cases of the Brier and Ranked Probability Scores:

$$U = \begin{cases}
\tfrac{1}{2}\left(1 - q'q\right) & \text{(Brier score)} \\
\tfrac{1}{2}\left(q_B(1 - q_B) + q_A(1 - q_A)\right) & \text{(Ranked probability score)}
\end{cases} \tag{A 11}$$
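
As a numerical illustration of these formulas, the Python sketch below evaluates the
uncertainty in three ways: as the expected score when the climatology is issued, via
the matrix expression (A 9), and via the closed forms (A 11). It is a sketch under the
same assumptions as before (score of the form $(p - o)'L'L(p - o)$, category order
(B, N, A), hypothetical function names), not the authors' implementation.

```python
import math

def gram(L):
    """Return L'L for a 3x3 matrix L given as a list of rows."""
    return [[sum(L[k][i] * L[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def uncertainty(L, q):
    """U(q) = v'q - q'L'Lq with v = diag(L'L), as in (A 9)."""
    G = gram(L)
    linear = sum(G[i][i] * q[i] for i in range(3))
    quadratic = sum(q[i] * G[i][j] * q[j] for i in range(3) for j in range(3))
    return linear - quadratic

def expected_score(L, q):
    """Expected score when q is forecast and category i occurs with probability q[i]."""
    G = gram(L)
    total = 0.0
    for i in range(3):
        d = [q[j] - (1.0 if j == i else 0.0) for j in range(3)]
        total += q[i] * sum(d[a] * G[a][b] * d[b] for a in range(3) for b in range(3))
    return total

r = 1.0 / math.sqrt(2.0)
L_BRIER = [[r, 0, 0], [0, r, 0], [0, 0, r]]
L_RPS = [[r, 0, 0], [r, r, 0], [r, r, r]]

q = (0.2, 0.5, 0.3)   # illustrative climatology (made up)
brier_closed = 0.5 * (1.0 - sum(x * x for x in q))
rps_closed = 0.5 * (q[0] * (1.0 - q[0]) + q[2] * (1.0 - q[2]))
print(uncertainty(L_BRIER, q), expected_score(L_BRIER, q), brier_closed)
print(uncertainty(L_RPS, q), expected_score(L_RPS, q), rps_closed)
```

For the uniform climatology $q' = (1/3, 1/3, 1/3)$ the Brier uncertainty is $1/3$, so
that $\sqrt{U} \approx 0.577$.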



(d) Quadratic recalibration

An original forecast $p' = (p_B, p_N, p_A)$ is defined by $p_B$ and $p_A$ since
$p_N = 1 - p_B - p_A$. Define a recalibrated forecast $\tilde{p}' = (\tilde{p}_B, \tilde{p}_N, \tilde{p}_A)$ by

$$\begin{aligned}
\tilde{p}_B &= C_1 + C_2 p_B + C_3 p_A + C_4 p_B^2 + C_5 p_B p_A + C_6 p_A^2 \\
\tilde{p}_A &= C_7 + C_8 p_B + C_9 p_A + C_{10} p_B^2 + C_{11} p_B p_A + C_{12} p_A^2 \\
\tilde{p}_N &= 1 - \tilde{p}_B - \tilde{p}_A
\end{aligned} \tag{A 12}$$

Numerical values for the parameters $C_1, \ldots, C_{12}$ can be found by minimising the
mean score $\bar{S}(\tilde{p}, o)$ of the recalibrated forecasts.
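
One way to carry out this minimisation is sketched below in Python. The sketch makes
several assumptions that are not in the text: the Brier Score is used as the quadratic
score, plain gradient descent (with an arbitrarily chosen step size and iteration
count) stands in for the unspecified numerical optimiser, the search starts from the
identity recalibration, and no constraint keeps the recalibrated forecasts inside the
triangle. The function names and the toy data are hypothetical.

```python
def features(p):
    """Quadratic terms (1, p_B, p_A, p_B^2, p_B p_A, p_A^2) of a forecast."""
    p_B, _, p_A = p
    return (1.0, p_B, p_A, p_B * p_B, p_B * p_A, p_A * p_A)

def recalibrate(C, p):
    """Apply the quadratic recalibration with coefficients C[0..11] (C_1..C_12)."""
    f = features(p)
    t_B = sum(c * x for c, x in zip(C[:6], f))
    t_A = sum(c * x for c, x in zip(C[6:], f))
    return (t_B, 1.0 - t_B - t_A, t_A)

def brier(p, o):
    return 0.5 * sum((pi - oi) ** 2 for pi, oi in zip(p, o))

def mean_score(C, forecasts, observations):
    pairs = list(zip(forecasts, observations))
    return sum(brier(recalibrate(C, p), o) for p, o in pairs) / len(pairs)

IDENTITY = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]   # leaves every forecast unchanged

def fit(forecasts, observations, step=0.05, iterations=2000):
    """Gradient descent on the mean Brier Score of the recalibrated forecasts."""
    C = [float(c) for c in IDENTITY]
    n = len(forecasts)
    for _ in range(iterations):
        grad = [0.0] * 12
        for p, o in zip(forecasts, observations):
            f = features(p)
            t = recalibrate(C, p)
            r_B, r_A = t[0] - o[0], t[2] - o[2]
            # The residual in category N equals -(r_B + r_A), so the derivative
            # of the score with respect to t_B is 2 r_B + r_A (similarly for t_A).
            g_B, g_A = 2.0 * r_B + r_A, 2.0 * r_A + r_B
            for j in range(6):
                grad[j] += g_B * f[j] / n
                grad[6 + j] += g_A * f[j] / n
        C = [c - step * g for c, g in zip(C, grad)]
    return C

# Toy data (made up for illustration only); observations are unit vectors (B, N, A).
B, N, A = (1, 0, 0), (0, 1, 0), (0, 0, 1)
forecasts = [(0.6, 0.3, 0.1)] * 4 + [(0.2, 0.3, 0.5)] * 4 + [(0.3, 0.4, 0.3)] * 2
observations = [B, B, N, A, A, A, N, B, N, A]
C = fit(forecasts, observations)
print(mean_score(IDENTITY, forecasts, observations), mean_score(C, forecasts, observations))
```

Because the recalibrated forecast is linear in the coefficients and the score is
quadratic, the problem could equally be solved as a linear least-squares problem;
gradient descent is used here only to keep the sketch free of external libraries.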